Orthogonal Nonnegative Matrix Tri-factorization for Semi-supervised Document Co-clustering
Abstract
Semi-supervised clustering is often viewed as using labeled data to aid the clustering process. However, existing algorithms fail to consider dual constraints between data points (e.g., documents) and features (e.g., words). To address this problem, we propose OSS-NMF, a novel semi-supervised document co-clustering model based on orthogonal nonnegative matrix tri-factorization. Our model incorporates prior knowledge on both the document and word sides to aid the construction of the new word-category and document-cluster matrices. In addition, we prove the correctness and convergence of our model to demonstrate its mathematical rigor. Our experimental evaluations show that the proposed document clustering model yields remarkable performance improvements with certain constraints.
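The abstract does not state the update rules of OSS-NMF, so the following is only a minimal sketch of the unconstrained building block it names: approximating a nonnegative document-word matrix X by F S Gᵀ, where F plays the role of the document-cluster matrix and G the word-category matrix. It assumes square-root multiplicative updates of the kind commonly used for orthogonality-encouraging tri-factorization. The function name `tri_factorize`, its parameters, and the toy matrix are hypothetical, and the semi-supervised document and word constraints of the proposed model are not included.

```python
import numpy as np

def tri_factorize(X, k, l, n_iter=200, eps=1e-9, seed=0):
    """Sketch: approximate nonnegative X (docs x words) as F @ S @ G.T.

    F: (docs x k) document-cluster matrix
    S: (k x l)    cluster-association matrix
    G: (words x l) word-category matrix
    Illustrative assumption only; not the OSS-NMF algorithm itself.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Random strictly positive initialization keeps all updates well defined.
    F = rng.random((n, k)) + eps
    S = rng.random((k, l)) + eps
    G = rng.random((m, l)) + eps
    for _ in range(n_iter):
        # Update the word-side factor G.
        XtFS = X.T @ F @ S
        G *= np.sqrt(XtFS / (G @ (G.T @ XtFS) + eps))
        # Update the document-side factor F.
        XGSt = X @ G @ S.T
        F *= np.sqrt(XGSt / (F @ (F.T @ XGSt) + eps))
        # Update the middle factor S.
        S *= np.sqrt((F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps))
    return F, S, G

# Toy document-word matrix with two document groups and two word groups.
X = np.kron(np.eye(2), np.ones((3, 4))) + 0.01
F, S, G = tri_factorize(X, k=2, l=2)
doc_clusters = F.argmax(axis=1)    # cluster index per document
word_categories = G.argmax(axis=1) # category index per word
```

Because every update multiplies by a nonnegative ratio, the factors stay nonnegative throughout; reading cluster labels off the row-wise argmax of F and G is one simple convention, not necessarily the one used in the paper.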
Related resources
A Projected Alternating Least square Approach for Computation of Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF) is a common method in data mining that has been used in different applications as a dimension reduction, classification, or clustering method. Alternating least squares (ALS) methods are usually used to solve this non-convex minimization problem. At each step of ALS algorithms, two convex least squares problems must be solved, which causes high com...
Orthogonal Nonnegative Matrix Factorization for Multi-type Relational Clustering
Relational clustering with heterogeneous data objects has an impact on various important applications, such as web mining, text mining, and bioinformatics. In this paper, we build a star-structured general model for relational clustering. It is formulated as an orthogonal tri-nonnegative matrix factorization. The model performs matrix approximation among all different data types to look for hid...
Nonnegative Matrix Factorization with Orthogonality Constraints
Nonnegative matrix factorization (NMF) is a popular method for multivariate analysis of nonnegative data, the goal of which is to decompose a data matrix into a product of two factor matrices with all entries in factor matrices restricted to be nonnegative. NMF was shown to be useful in a task of clustering (especially document clustering), but in some cases NMF produces the results inappropria...
Orthogonal Nonnegative Matrix Factorization: Multiplicative Updates on Stiefel Manifolds
Nonnegative matrix factorization (NMF) is a popular method for multivariate analysis of nonnegative data, the goal of which is to decompose a data matrix into a product of two factor matrices with all entries in factor matrices restricted to be nonnegative. NMF was shown to be useful in a task of clustering (especially document clustering). In this paper we present an algorithm for orthogonal nonn...
Sentiment Classification with Graph Co-Regularization
Sentiment classification aims to automatically predict the sentiment polarity (e.g., positive or negative) of user-generated sentiment data (e.g., reviews, blogs). To obtain sentiment classification with high accuracy, supervised techniques require a large amount of manually labeled data. The labeling work can be time-consuming and expensive, which makes unsupervised (or semi-supervised) sentiment a...
Publication date: 2010